ABSTRACT

We conduct a detailed analysis of the photometric redshift requirements for
the proposed Dark Energy Survey (DES) using two sets of mock galaxy
simulations and an artificial neural network code, ANNz. In particular, we
examine how optical photometry in the DES grizY bands can be complemented
with near infra-red photometry from the planned VISTA Hemisphere Survey (VHS)
in the JHKs bands. We find that the rms scatter on the photometric redshift
estimate over 1 < z < 2 is $\sigma_z = 0.2$ from DES alone and
$\sigma_z = 0.15$ from DES+VISTA, i.e. an improvement of more than 30%.
We draw attention to the effects of galaxy formation scenarios such as
reddening on the photo-z estimate and, using our neural network code,
calculate the extinction, $A_v$, for these reddened galaxies. We also look at
the impact of using different training sets when calculating photometric
redshifts. In particular, we find that using the ongoing DEEP2 and VVDS-Deep
spectroscopic surveys to calibrate photometric redshifts for DES will prove
effective. However, we need to be aware of uncertainties in the photometric
redshift bias that arise when using different training sets, as these will
translate into errors in the dark energy equation of state parameter, w.
Furthermore, we show that the neural network error estimate on the
photometric redshift may be used to remove outliers from our samples before
any kind of cosmological analysis, in particular for large-scale structure
experiments. By removing all galaxies with a neural network photo-z error
estimate of greater than 0.1 from our DES+VHS sample, we can constrain the
galaxy power spectrum out to a redshift of 2 and reduce the fractional error
on this power spectrum by ~15-20% compared to using the entire catalogue.

Output tables of spectroscopic redshift versus photometric redshift used to
produce the results in this paper can be found at
www.star.ucl.ac.uk/~mbanerji/DESdata.

Key words: Cosmology: Photometric redshift surveys - Dark Energy



1 INTRODUCTION 

It is now widely accepted that dark energy is responsible
for driving the observed acceleration of the Universe. In recent
years, measuring and constraining the nature of this
dark energy has become a central focus of current studies in
cosmology. Several different methods have been developed
and shown to probe the nature of dark energy through its
effects on the geometry and structure of the Universe, e.g.
Riess et al. (1998); Hu (1999); Blake & Glazebrook (2003).
Large-scale sky surveys such as the Sloan Digital Sky Survey
have no doubt aided this kind of study (Tegmark et al.
2004; Spergel et al. 2007) and more such galaxy surveys are
now being planned to exploit the different techniques that
will help us better understand the nature of dark energy.

The proposed Dark Energy Survey is one such exper- 
iment. It will use four independent probes namely galaxy 
clusters, galaxy power spectrum measurements, weak lens- 
ing studies and a supernova survey to constrain the nature 
of dark energy. Each of these methods relies on accurate 
distance measurements extending over cosmological scales. 
Given the wealth of data that will be available to us from 
such surveys, measuring distances and redshifts for all the 
objects using spectroscopic methods clearly becomes unfea- 
sible. Hence the need for photometric redshifts. 

Photometric redshift estimation methods have been
around since the 1960s but have undergone a recent revival
with proposals for a new generation of large-scale photometric
surveys such as the Dark Energy Survey. New algorithms
have been developed for photo-z estimation and made
available to the community, e.g. Collister & Lahav (2004);
Bolzonella et al. (2000); Benítez (2000); Feldmann et al.
(2006); Babbedge et al. (2004). Furthermore, there is currently
a lot of emphasis on optimising the depth and number
of bands that will be used to image galaxies in future galaxy 
surveys so as to obtain accurate photometric redshifts. It is 
widely known that imaging in more bands can help reduce 
errors on photometric redshifts but the costs of adding more 
filters to planned surveys are significant. Given the plethora 
of data that is now becoming available to us covering the 
whole range of the electromagnetic bands and a large por- 
tion of the observable sky, it is vital that we explore the 
overlap between different surveys and fully exploit the data 
sets available to us, in order to achieve the best compromise 
between cost and science. 

In this paper we analyse the prospect of combining op- 
tical data from the Dark Energy Survey (DES) with near 
infra-red data from the VISTA Hemisphere Survey (VHS) in
order to obtain accurate photometric redshifts that will help 
us better constrain the nature of dark energy. We begin with 
a brief description of these two proposed surveys and a de- 
scription of the method used to generate mock galaxy sam- 
ples for both of these surveys. We then proceed to a full 
photometric redshift analysis of simulated data from these 
two surveys using artificial neural networks. We assess the 
impact of reddening on our photometric redshift estimate as 
well as the effects of removing outliers and using different 
training sets. In each case, we present results obtained for 
the optical data from DES only and for optical and near
infra-red data from DES and VHS. Finally, we look at the
implications of our results for cosmological constraints on 
dark energy. In particular, we concentrate on the impact 
of photometric redshift errors on constraints on dark energy 
using galaxy power spectrum measurements. All magnitudes 
quoted in this paper are in the AB system. 



2 THE DARK ENERGY SURVEY (DES) 

The Dark Energy Survey is a proposed ground-based photometric
survey that will image 5000 deg$^2$ of the South Galactic
Cap in the optical griz bands as well as the Y-band. The
survey will be carried out using the Blanco 4-m telescope
at the Cerro Tololo Inter-American Observatory (CTIO) in
Chile. The main objectives of the survey are to extract
information on the nature and density of dark energy and
dark matter using galaxy clusters, galaxy power spectrum
measurements, weak lensing studies and a supernova survey.
This will be achieved by measuring redshifts of some
300 million galaxies in the redshift range 0 < z < 2, tens
of thousands of clusters in the redshift range 0 < z < 1.1
and about 2000 Type Ia supernovae in the redshift range
0.3 < z < 0.75 (The Dark Energy Survey Collaboration
2005). Observations will be carried out over 525 nights
spread over five years between 2010 and 2014 and, when
completed, DES will provide a legacy archive of data extending
around two magnitudes deeper than the Sloan Digital Sky
Survey, which is currently the largest existing CCD survey of
the Universe by volume. We have estimated the DES volume
to be 23.74 $h^{-3}$ Gpc$^3$ in the range 0 < z < 2, about ten
times that of the SDSS LRG sample (Blake et al. 2007). This is
assuming a 10$\sigma$ AB magnitude limit of r < 24.

Table 1. Areas and 10$\sigma$ magnitude limits for the surveys
discussed in this work. The magnitudes are in the AB system.
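
The volume estimate quoted above can be checked to first order with a short numerical integration. The sketch below is not the calculation used in this paper: it assumes a flat LambdaCDM cosmology with a matter density of 0.3 (the cosmology behind the quoted figure is not stated here), treats the survey as a simple cone of 5000 deg$^2$, and uses the hypothetical helper names `comoving_distance` and `survey_volume`. Under these assumptions the result should land close to the quoted value.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc
FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253 deg^2


def comoving_distance(z, omega_m=0.3, steps=2000):
    """Comoving distance in h^-1 Mpc for flat LambdaCDM (assumed cosmology).

    Integrates dz / E(z) with Simpson's rule; `steps` must be even.
    """
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + (1.0 - omega_m))

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * inv_e(k * h)
    return C_OVER_H0 * total * h / 3.0


def survey_volume(area_deg2, z_min, z_max, omega_m=0.3):
    """Comoving volume in h^-3 Gpc^3 of a cone of `area_deg2` between two redshifts."""
    d_min = comoving_distance(z_min, omega_m) / 1000.0  # h^-1 Gpc
    d_max = comoving_distance(z_max, omega_m) / 1000.0
    shell = 4.0 / 3.0 * math.pi * (d_max ** 3 - d_min ** 3)
    return shell * area_deg2 / FULL_SKY_DEG2


des_volume = survey_volume(5000.0, 0.0, 2.0)
print(f"DES volume for 0 < z < 2: {des_volume:.1f} h^-3 Gpc^3")
```

Changing `omega_m` shows how sensitive the volume is to the assumed cosmology, which is one reason the sketch should only be read as a consistency check.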

The DES survey area overlaps with that of several other 
important current and future surveys for example the south- 
ern equatorial strip of the Sloan Digital Sky Survey and the 
South Pole Telescope SZE cluster survey. The entire DES re- 
gion will also be imaged in the near infra-red bands on two 
public surveys being conducted on the Visible and Infra-Red 
Survey Telescope for Astronomy (VISTA) at ESO's Cerro 
Paranal Observatory in Chile. 



3 THE VISTA HEMISPHERE SURVEY (VHS) 

Most of the time on the VISTA telescope has been dedicated
to large-scale public surveys. Two of these surveys
that are relevant to cosmology are the VISTA Hemisphere
Survey (VHS) and the VISTA Kilo-Degree Infra-red Galaxy
Survey (VIKING) (Arnaboldi et al. 2007).

The VISTA Hemisphere Survey is a proposed
panoramic infra-red survey that will image the entire southern
sky (~20000 deg$^2$) in the near infra-red YJHKs bands
when combined with other public surveys. About 40% of the
total VHS time has been dedicated to VHS-DES, a 4500 deg$^2$
survey being carried out in the DES region over 125 nights in
order to complement the DES optical data with near infra-red
data. The initial proposal is for the survey to image in
the JHKs bands with 120s exposure times in each band,
reaching 10$\sigma$ magnitude limits of J = 20.4, H = 20.0 and
Ks = 19.4. A second pass may then be obtained with 240s
exposures in each of the three NIR filters in order to reach
the full depth required by DES. The VHS-DES survey assumes
that Y-band photometry will come from the Dark
Energy Survey.
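
As a rough guide to what longer exposures buy, the gain in limiting magnitude can be estimated under the common assumption that the observations are background limited, so that the signal-to-noise ratio scales as the square root of the exposure time. The sketch below is only an illustration under that assumption; the function name `depth_gain_mag` is hypothetical, and whether the two passes are co-added is also an assumption rather than something stated in the survey proposal.

```python
import math


def depth_gain_mag(t_new, t_old):
    """Gain in limiting magnitude when the exposure time goes from t_old to t_new.

    Assumes background-limited imaging, where S/N scales as sqrt(t), so the
    limiting flux scales as 1/sqrt(t) and the gain is 2.5*log10(sqrt(t_new/t_old)).
    """
    return 1.25 * math.log10(t_new / t_old)


# Doubling the exposure time from 120s to 240s.
gain_240_vs_120 = depth_gain_mag(240.0, 120.0)

# Co-adding a 120s pass and a 240s pass (assumption), relative to 120s alone.
gain_coadd = depth_gain_mag(120.0 + 240.0, 120.0)

print(f"240s vs 120s: {gain_240_vs_120:.2f} mag deeper")
print(f"120s + 240s vs 120s: {gain_coadd:.2f} mag deeper")
```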

The remaining 500 deg$^2$ of the DES area not covered by
VHS-DES will be imaged by VIKING, which is a near infra-red
survey designed to provide an important complement to
the optical KIDS project being carried out on the VST. The
details for all these surveys are summarised in Table 1.



4 SIMULATING MOCK DATA

In this work, we have used two sets of mock galaxy samples 
as simulations of data from DES and VHS. In this section 
we briefly describe the way in which these data samples 
are generated. Both catalogues are generated using Monte 
Carlo methods after assuming relevant redshift, magnitude 
and type distributions. 



4.1 DES5yr Sample

The first mock catalogue is that of Oyaizu et al. (2006)
and Lin et al. (2004), DES5yr hereafter. It adopts the
galaxy magnitude-redshift distribution derived from the
luminosity functions of Lin et al. (1999) and Poli et al.
(2003) and a type distribution derived using data from
the GOODS/HDF-N field (Capak et al. 2004; Wirth et al.
2004; Cowie et al. 2004) and the CWW template SEDs
(Coleman et al. 1980). A flux-limited sample is constructed
with 0 < z < 2 and 20 < i < 24. The photometric errors on
each object are computed according to the DES 10$\sigma$ griz
magnitude limits. Note that no attempt is made here to fit a
reddening value to each galaxy.
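
To illustrate how a magnitude limit translates into a photometric error for each mock galaxy, the sketch below uses a simple noise model. It is an assumption, not necessarily the prescription used to build the catalogue: the flux noise is taken to be constant (sky dominated), so that an object at the limiting magnitude has a signal-to-noise ratio of exactly 10. The function names are hypothetical.

```python
import math

POGSON = 2.5 / math.log(10.0)  # about 1.0857; converts fractional flux error to magnitudes


def signal_to_noise(mag, mag_limit, nsigma=10.0):
    """S/N of an object of magnitude `mag` when `mag_limit` is the nsigma limit.

    Assumes constant flux noise, so S/N scales linearly with flux.
    """
    return nsigma * 10.0 ** (-0.4 * (mag - mag_limit))


def magnitude_error(mag, mag_limit, nsigma=10.0):
    """Approximate 1-sigma magnitude error, valid when S/N is not too small."""
    return POGSON / signal_to_noise(mag, mag_limit, nsigma)


# Example with the r < 24 limit quoted in Section 2.
for r_mag in (22.0, 23.0, 24.0):
    print(r_mag, round(magnitude_error(r_mag, 24.0), 4))
```

In this model a galaxy one magnitude brighter than the limit has an error about 2.5 times smaller than a galaxy at the limit.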



4.2 JPL Mock Catalogue

The second catalogue is described in Abdalla et al. (2007),
JPLCAT hereafter. Templates constructed from broadband
photometry using a method similar to Budavári et al.
(1999) were fit to real objects from the GOODS-N spectroscopic
sample (Cowie et al. 2004; Wirth et al. 2004) in order
to generate this catalogue. The templates were de-reddened
and at the time of fitting, the best fit SED and reddening
value were found simultaneously. A Calzetti reddening law
(Calzetti 1997) was used. Further details of the method
used to create photometric data for the catalogue with the
correct redshift distribution and luminosity evolution can
be found in §2 of Abdalla et al. (2007). For the purposes of
this paper, however, the important difference between this
and the DES5yr sample is the fact that the galaxies are
reddened and corrected for dust extinction. The JPLCAT
sample also uses two more templates from Kinney et al.
(1996) to fit the galaxies in addition to the CWW templates
used for the DES5yr sample. For the work described in the
rest of this paper, the JPLCAT sample when used has been
cut so as to have the same magnitude and redshift limits
as the DES5yr sample. The redshift distribution for both
samples cut to include the same number of galaxies, as well
as the distribution of galaxies in the JPLCAT sample for
different values of the extinction parameter, $A_v$, are shown
in Figure 1.

Note that the JPLCAT sample is more complex and
hence the results from it are likely to be more pessimistic
than those for the DES5yr sample. However, both catalogues
are generated by fitting models to real data and it is not
obvious which of these models captures the true colour variance
best. Hence both can be taken as realistic possibilities
for modelling the DES and VHS data samples.



5 ESTIMATING PHOTOMETRIC REDSHIFTS
USING ARTIFICIAL NEURAL NETWORKS:
ANNZ

Methods of estimating photometric redshifts fall into two
broad categories, namely template fitting methods and empirical
methods. Template fitting methods use libraries
of galaxy spectral energy distributions such as the observed
Coleman, Wu & Weedman templates (Coleman et al.
1980) or synthetic templates, e.g. Bruzual & Charlot (1993);
Fioc & Rocca-Volmerange (1997). The spectra are convolved
with a filter transmission function in order to calculate
the flux through each filter in the filter set being used
to observe the object. The fluxes can then be matched to
the observed fluxes of different objects using a $\chi^2$
minimisation to output the best-fit redshift and type of the galaxy.
Popular photo-z codes that use this method include HyperZ
(Bolzonella et al. 2000), BPZ (Benítez 2000) and many others.

Empirical methods on the other hand rely on the availability
of a suitably representative training set that can be
used to determine the functional relation

$$z = z(\mathbf{m}, \mathbf{w}) \tag{1}$$

where the redshift is some function of the magnitudes, m,
and some weights, w.

Once the redshift is known as a function of the magnitude,
this relation can be applied to a data set where
only the magnitude is known in order to determine the redshift.
Examples of this method include polynomial fitting
(Connolly et al. 1995), nearest neighbours (Csabai et al.
2003) and artificial neural networks (Collister & Lahav
2004) among others.

Artificial Neural Networks have been shown to produce
competitive results compared to other training set
methods (Firth et al. 2003) and we use the code ANNz
(Collister & Lahav 2004) to calculate photometric redshifts
in all the work that is described in this paper. The neural
network is made up of several layers, each consisting of a
number of nodes. The first layer receives the galaxy magnitudes
in different filters as inputs and the last layer outputs
the estimated photometric redshift. All nodes in the hidden
layers in between are interconnected and connections
between nodes i and j have an associated weight, $w_{ij}$.

ANNz, like all other empirical methods, requires a training
set that is used to minimise the cost function, E (Eq. 2),
with respect to the free parameters $w_{ij}$.

$$E = \sum_k \left( z_{\rm phot}(w_{ij}, m_k) - z_{{\rm train},k} \right)^2 \tag{2}$$
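
The structure of the network and of the cost function can be made concrete with a small sketch. This is not the ANNz implementation: it assumes a logistic activation in the hidden layers, a linear output node and random, untrained weights, and the function names are hypothetical. It only shows how magnitudes are propagated through an N:2N:2N:1 network with a bias weight per node and how the sum-of-squares cost is evaluated; the minimisation itself is not shown.

```python
import math
import random


def init_network(layer_sizes, rng):
    """Random weights for a fully connected network.

    Each node holds one weight per input plus a final bias weight.
    """
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def logistic(u):
    """Logistic function written with tanh so that it cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * u))


def forward(weights, mags):
    """Propagate one galaxy's magnitudes through the network; returns z_phot."""
    activations = list(mags)
    for depth, layer in enumerate(weights):
        is_output = depth == len(weights) - 1
        next_activations = []
        for node in layer:
            total = node[-1] + sum(w * a for w, a in zip(node[:-1], activations))
            next_activations.append(total if is_output else logistic(total))
        activations = next_activations
    return activations[0]


def cost(weights, training_mags, training_z):
    """Sum-of-squares cost over the training set, as in the cost function above."""
    return sum((forward(weights, m) - z) ** 2 for m, z in zip(training_mags, training_z))


rng = random.Random(0)
n_filters = 5  # e.g. grizY
weights = init_network([n_filters, 2 * n_filters, 2 * n_filters, 1], rng)
toy_mags = [[rng.uniform(20.0, 24.0) for _ in range(n_filters)] for _ in range(100)]
toy_z = [rng.uniform(0.0, 2.0) for _ in range(100)]
print("cost with untrained weights:", cost(weights, toy_mags, toy_z))
```

Training then amounts to adjusting `weights` so that `cost` decreases on the training set while being monitored on a separate validation set.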

The neural network setup is illustrated in Figure 2. If
the data is noisy, a validation set may be used in addition
to the training set to prevent over-fitting. During the initial
setup, one has to specify the architecture of the neural
network, i.e. the number of hidden layers and nodes in each hidden
layer. We choose this to be N:2N:2N:1 throughout this work
unless otherwise mentioned, where N is the number of filters
used for photometry. Note that we have tried changing
both the number of hidden layers as well as the number of
nodes in each hidden layer and find that it makes very little
difference to the photometric redshift estimate.

The neural network code produces an estimate of the
error associated with each photometric redshift estimate in
addition to the photo-z estimate. This error depends on the
noise on the neural network inputs and not on the difference
between the spectroscopic and photometric redshifts. The
variance that this noise on the input would introduce into
the output of the network is given by a simple chain rule
expression as follows:

$$\delta_z^2 = \sum_i \left( \frac{\partial z}{\partial m_i} \right)^2 (\delta m_i)^2 \tag{3}$$

where $\delta_z$ is the resulting error estimate on the photometric
redshift, the sum over i is a sum over all the network inputs and
$\delta m_i$ is the photometric error on the magnitude in band i. The
derivative is obtained using the formalism described in
Bishop (1995). This algorithm is fully implemented within
ANNz (Collister & Lahav 2004).

Figure 1. The distributions characterising our simulated catalogues for the DES and DES+VHS samples. The left-hand panel shows
the redshift distribution for ~30000 galaxies in the DES5yr and JPLCAT simulations. All galaxies have 20 < i < 24 and 0 < z < 2. The
right-hand panel shows the distribution of galaxies as a function of the extinction parameter, $A_v$, for the JPLCAT sample.

Figure 2. Schematic diagram of the neural network as implemented
by ANNz, from Collister & Lahav (2004). The input layer consists
of nodes that take magnitudes in the different filters used for
photometry. A single hidden layer consisting of p nodes is shown
here although more hidden layers could be used. The output layer
has a single node that gives the photometric redshift. Once again,
further nodes for more outputs such as spectral type could be
added to this layer. Each connecting line between nodes carries a
weight, $w_{ij}$. The bias node allows for an additive constant when
optimising weights.



6 PHOTOMETRIC REDSHIFT ANALYSIS

6.1 Choice of Filters

In this section we look at the impact of different filter combinations
and survey depths on the photometric redshift estimate.
We do this by running the neural network code described
in §5 on the DES5yr sample described in §4.1. ANNz
was run on the mock data for five different filter configurations.
These are summarised in Table 2.

We computed photometric redshifts for each of these
cases and from the available true redshifts, computed the
scatter on the photo-z estimate. The scatter is the rms photometric
redshift error around the mean and is defined in
the following way:

$$\sigma_z = \left\langle (z_{\rm spec} - z_{\rm phot})^2 \right\rangle^{1/2} \tag{4}$$

where the scatter is evaluated in a redshift bin between $z_1$
and $z_2$.

Figure 3 and Figure 4 show the results of this study. We
can see that inclusion of the NIR filters leads to an improvement
in $\sigma_z$ by ~30% for z > 1. $\sigma_z$ is 0.2 for z > 1 for the DES
only sample and 0.15 for the DES+VHS sample in the same
redshift range. Increasing the exposure time in the NIR also
leads to improved scatter on the photometric redshift. The
scatter is high at low redshifts due to lack of u-band imaging.
These results are consistent with those of Abdalla et al.
(2007), The Dark Energy Survey Collaboration (2005) and
Oyaizu et al. (2006).

Figure 3. Scatter plots of photometric redshifts as a function of the true redshifts for each of the different survey configurations detailed
in Table 2. These plots are generated for a sample of 5000 galaxies.

Table 2. Summary of filter configurations of DES and VHS considered
in §6.1.

Figure 4. The 1$\sigma$ scatter on the photometric redshift as a function
of the spectroscopic redshift for each of the survey configurations
detailed in Table 2. Curves are labelled 1 to 5 corresponding
to the numbers in Table 2.

As can be seen in Figure 3, however, there are many
outliers present in the sample. Another useful quantity to
consider is therefore $\sigma_{68}$, which is the interval in which 68%
of the galaxies have the smallest difference between their
spectroscopic and photometric redshifts. This will give us
some indication of the scatter in the photometric redshift
estimate once the outliers have been removed. We find that
for DES grizY photometry, $\sigma_z$ is 0.13 across the entire redshift
range of 0 < z < 2 whereas $\sigma_{68}$ is 0.08. When we add
the VHS JHKs photometry to this, $\sigma_z$ improves to 0.11
across the entire redshift range and $\sigma_{68}$ improves to 0.07.


6.2 Impact of Galactic Reddening 

In this section, we look at the impact of reddening on the
photometric redshift estimate. We have already discussed
how the DES5yr and JPLCAT samples differ in their inclusion
of reddening in the galaxy samples. In order to assess
how this difference affects the photo-z estimate, we run our
neural network code on the JPLCAT sample with 5-band
DES optical photometry as well as 8-band DES+VHS photometry
with an exposure time of 120s in the NIR. The
results are shown in Figure 5, where we plot the 1$\sigma$ scatter
defined in Eq. 4 as a function of the spectroscopic redshift
for both the DES5yr and JPLCAT samples for each of the
two filter configurations. Note that before comparing the two
catalogues, the JPLCAT sample has been cut to have the
same magnitude and redshift limits as the DES5yr sample,
i.e. 0 < z < 2 and 20 < i < 24.

Although the same improvement is noted with inclusion
of the NIR filters as discussed in §6.1, we find that the effects
of reddening worsen the photo-z scatter overall by ~30% in
some regions. This can be explained by the fact that there
exists a degeneracy between redshift and galaxy reddening
which means that faint reddened galaxies at low redshift
can often appear to have the same colours as brighter galaxies
at high redshift with no reddening (Abdalla et al. 2007).
However, Figure 12 of Abdalla et al. (2007) shows that this
degeneracy is broken in the redshift range 1.1 < z < 1.5
and we can see that the reddened DES only catalogues have
a similar scatter to their unreddened counterparts in this
redshift range. These authors have also shown that galaxies
with small values of $A_v$ have relatively good photo-z estimates
whereas those with high $A_v$ are scattered towards
higher photometric redshifts. Figure 1 shows that most of
the galaxies in our JPLCAT sample have relatively small
values of $A_v$ and therefore, while we need to be aware that
any amount of dust extinction is likely to affect our photo-z
estimate, this effect should only be small for the DES sample.

In order to account for this effect of the dust extinction
on the photometric redshift estimate, some authors attempt
to include the dust extinction, $A_v$, as a free parameter in
their codes (e.g. Rowan-Robinson (2003); Bolzonella et al.
(2000)) and simultaneously solve for this and the photometric
redshift. We have modified our neural network code to
produce estimates for the $A_v$ and SED type of the galaxy
using the JPLCAT sample with 8-band DES+VHS photometry.
We use an 8:16:16:2 architecture for the neural network
and marginalise over the redshift estimate.

The results are shown in Figure 6, where we plot density
plots of the true $A_v$ against the predicted $A_v$ and the true
type against the predicted type.

We find the rms scatter around the mean of the $A_v$
estimate to be 0.27 and the bias to be 0.0031. The predicted
$A_v$ is found to be biased towards lower values of $A_v$ for
galaxies with a high degree of reddening and towards higher
values of $A_v$ for galaxies with a low value of reddening. We
also find the scatter on the type to be 7.7 and the bias to be
-0.0048. The JPLCAT sample has been generated using six
SED templates: E, Sbc, Scd, Im (Coleman et al. 1980) and
SB2, SB3 (Kinney et al. 1996), corresponding to types 0, 10,
20, 30, 40 and 50. As the error on the type is smaller than
the difference between these templates, we can effectively
use ANNz to classify our galaxies into 50/7.7 = about six
or seven spectral types within the context of DES.

Figure 5. The 1$\sigma$ scatter on the photometric redshift for DES
with and without VHS NIR data for two different mock catalogues.
The black lines are produced by DES5yr catalogues that
do not include the effects of reddening. The green lines are produced
by the JPLCAT mocks which include the effect of reddening.
The solid lines show the scatter without VHS NIR data while
the dashed lines include VHS NIR data. For both sets of mocks,
the VHS NIR data improves the photo-z scatter by a factor of ~2
at z > 1. In regions of interest, the photo-z scatter is worsened by
~30% when we include reddening in our mocks.

Figure 6. Density plot of the ANNz output when the neural network code is used to simultaneously predict the $A_v$ and type of an
object. The left-hand plot shows the predicted dust extinction, $A_v$, as a function of the true $A_v$. The right-hand plot shows the predicted
SED type of each galaxy as a function of the true type. The plots are colour-coded and the scale is exponential; a colour difference of
one is equivalent to the density being decreased by a factor of e. The solid black lines show where the true $A_v$ and true type are equal
to the predicted $A_v$ and predicted type.

Note that in this work, we have made no attempt to
optimise our neural network for the calculation of the $A_v$ and
type. We simply note that it is possible to use our neural
network code to produce estimates for these quantities as
well as the redshift and that this may be useful for samples
where we know there is a high degree of reddening.

Through the rest of this work, we have used the DES5yr
sample for all the analysis.

6.3 Clipped Catalogues 

In the previous sections we have seen that catastrophic errors
in the photometric redshift estimate can arise depending
on the exact filter configuration and the galaxy formation
science encoded within mock catalogues. Given that
there are likely to be a host of different reasons why the
photometric redshift estimate may be prone to large errors,
a lot of which we do not fully understand, it seems sensible
to devise some way of clipping a sample. This is done
by removing galaxies with large photo-z errors before using
the photo-z estimate for cosmological analysis. In most situations
where photometric redshift analysis is particularly
powerful, we do not know the spectroscopic redshift of the
galaxies and therefore have no way of using this information
to assess whether the photo-z estimate is accurate. However,
the photo-z prediction will depend strongly on the errors in
the photometry and this information could potentially be
used to clip our sample as in Abdalla et al. (2007). In order
to do this, we consider the neural network error estimate
on the photometric redshift for each galaxy as given by Eq.
3. We then remove all galaxies from our sample that have
an estimated error greater than a chosen threshold, therefore
resulting in a clipped catalogue of galaxies. We use the quantity
$\sigma_{68}$, introduced in §6.1, to quantify the scatter in the
photometric redshift estimate once the outliers have been
removed. In this section, we extend the work of Abdalla et al.
(2007) and look at how the scatter on the photo-z estimate
varies with different clipping thresholds.
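
The per-galaxy error estimate used for clipping is the chain-rule propagation of the photometric errors through the network described in §5. A minimal sketch of that propagation is given below. It replaces the analytic network derivative used by ANNz with central finite differences, and it uses a hypothetical linear colour-redshift relation, `toy_photoz`, as a stand-in for a trained network; both choices are assumptions made purely for illustration.

```python
import math


def propagate_error(predict, mags, mag_errors, step=1e-3):
    """Propagate magnitude errors through `predict` with the chain rule.

    The derivative of the output with respect to each input magnitude is
    estimated numerically with a central finite difference.
    """
    variance = 0.0
    for i, err in enumerate(mag_errors):
        up = list(mags)
        down = list(mags)
        up[i] += step
        down[i] -= step
        dz_dm = (predict(up) - predict(down)) / (2.0 * step)
        variance += (dz_dm * err) ** 2
    return math.sqrt(variance)


def toy_photoz(mags):
    """Hypothetical stand-in for a trained network (not a real photo-z relation)."""
    g, r, i = mags
    return 0.5 + 0.8 * (g - r) + 0.3 * (r - i)


z_error = propagate_error(toy_photoz, [23.0, 22.4, 22.1], [0.10, 0.05, 0.05])
print("estimated photo-z error:", round(z_error, 4))
```

For the toy relation the derivatives are 0.8, -0.5 and -0.3, so the result can be checked by hand as the square root of 0.0064 + 0.000625 + 0.000225.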

Figure 7 illustrates the results of this study. In this figure
we also plot the fraction of galaxies that remain in the
sample once the clipping thresholds are applied. In each case,
we plot results using a 5-band DES optical grizY catalogue
as well as an 8-band DES+VHS optical and NIR catalogue.

As expected, applying smaller threshold errors at which
to cut our sample results in a fall in the 1$\sigma$ scatter for the
entire sample. $\sigma_{68}$ also decreases as we reduce the threshold
error although this decrease is less steep than the decrease in
$\sigma_z$. Both the scatter and $\sigma_{68}$ are larger for the DES sample
compared to the DES+VHS sample. We can see that applying
a fairly conservative cut of 0.1 to our mock samples
results in a reduction in $\sigma_z$ by a factor of ~1.5 for both
the DES and DES+VHS samples. In both cases we retain
about 80% of our original sample after this cut. We can apply
smaller threshold errors in order to reduce the scatter on
our photometric redshift estimate further. However, as we
do this, we lose more galaxies from our original sample and
at some point the number of galaxies remaining will prove
insufficient for statistical analysis. We examine this point in
more detail later in §7.1.

Figure 7. The left-hand plot shows the scatter, $\sigma_z$, and $\sigma_{68}$ as a function of the clipping threshold. The right-hand plot shows the
fraction of galaxies remaining in the sample after the cuts are applied as a function of the clipping threshold. Galaxies with ANNz errors
above the clipping threshold are removed from our sample.
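
The clipping procedure itself is simple to express in code. The sketch below uses a tiny hand-made catalogue, invented for the example, and a hypothetical function name, `clip_catalogue`; it removes every galaxy whose estimated photo-z error exceeds the threshold and reports the rms scatter of the survivors together with the fraction of the sample retained.

```python
import math


def clip_catalogue(z_spec, z_phot, z_err, threshold):
    """Remove galaxies with estimated photo-z error above `threshold`.

    Returns (rms scatter of the remaining galaxies, fraction retained).
    """
    kept = [(s, p) for s, p, e in zip(z_spec, z_phot, z_err) if e <= threshold]
    fraction = len(kept) / len(z_spec)
    if not kept:
        return float("nan"), fraction
    scatter = math.sqrt(sum((s - p) ** 2 for s, p in kept) / len(kept))
    return scatter, fraction


# Toy catalogue: the third galaxy is an outlier with a large error estimate.
z_spec = [0.50, 1.00, 1.50, 1.80]
z_phot = [0.52, 0.95, 0.90, 1.75]
z_err = [0.03, 0.06, 0.25, 0.08]

for threshold in (0.3, 0.1, 0.05):
    scatter, fraction = clip_catalogue(z_spec, z_phot, z_err, threshold)
    print(threshold, round(scatter, 3), fraction)
```

Lowering the threshold reduces the scatter but also the retained fraction, which is the trade-off described above.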



6.4 Impact of Training Sets 

6.4.1 Effect on the Photo-z Scatter

All the photometric redshift analysis carried out in the previous
sections assumes that the training set used to train the
neural network is totally representative of the testing set.
However, in reality, this may not always be the case. In this
section, we look at the impact of using different training 
sets with different imposed colour and magnitude cuts on 
the photometric redshift estimate. 

The Dark Energy Survey region overlaps with that of
several other current and future photometric and spectroscopic
surveys, thereby providing it with a fairly complete
sample of training set galaxies. Some of these are detailed in
Table 3. Here we consider two of the deeper surveys, namely
DEEP2 and VVDS-Deep, and model our training sets on
the redshift distributions of these surveys before performing
the usual photometric redshift analysis. Note that as DES
overlaps the VVDS-Deep and DEEP2 fields, we assume that
objects in these spectroscopic surveys will be imaged in all
the DES bands. As the DES5yr sample is magnitude limited
as described in §4.1, we do not consider the SDSS and
2dFGRS training sets as these objects are brighter than the
mocks considered.

DEEP2 is an ongoing spectroscopic survey being carried
out by the DEIMOS spectrograph on the Keck II telescope.
On completion, it will have obtained spectroscopic redshifts
for ~54000 objects over an area of 3.5 deg$^2$. The survey has
been designed to sample the redshift range of 0.75 < z <
1.5 and the spectrograph is capable of obtaining moderately
high resolution spectra between 6300 Å and 9100 Å. Targets
are pre-selected using BRI imaging on the CFH12k camera
on the Canada-France-Hawaii Telescope with a magnitude
limit of $R_{AB}$ < 24.1 and the colour cuts detailed in Table 3
imposed in order to sample the redshift range of interest.

In this study, we use 4681 objects with spectra from
DEEP2 DR1 (Davis et al. 2003) to construct the normalised
redshift distribution for the DEEP2 survey. This is plotted
in Figure 8. As can be seen, there are very few objects with
redshifts less than ~0.7 and greater than ~1.4. This is
because the wavelength range for the spectrograph has been
chosen such that the strong [OII] doublet, which has a rest-frame
wavelength of 3727 Å, lies outside these wavelengths for
all other redshifts. Note that we have not included data from
the Groth Survey Strip region in this study. This survey field
has no imposed colour cuts and therefore may be useful for
sampling the low-redshift range of DES.
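
The redshift window quoted above follows directly from requiring the observed wavelength of the 3727 Å line, which scales as (1 + z), to fall inside the 6300 Å to 9100 Å spectral coverage. The short sketch below makes this arithmetic explicit; the function name is hypothetical and the wavelengths are those quoted in the text.

```python
LINE_REST_ANGSTROM = 3727.0  # rest-frame wavelength of the doublet quoted in the text


def redshift_window(lambda_min, lambda_max, rest_wavelength=LINE_REST_ANGSTROM):
    """Redshift range over which a line at `rest_wavelength` is observable.

    The observed wavelength is rest_wavelength * (1 + z), so each edge of the
    spectral coverage maps to one redshift limit.
    """
    z_low = lambda_min / rest_wavelength - 1.0
    z_high = lambda_max / rest_wavelength - 1.0
    return z_low, z_high


z_low, z_high = redshift_window(6300.0, 9100.0)
print(f"line visible for {z_low:.2f} < z < {z_high:.2f}")
```

The resulting limits of roughly 0.69 and 1.44 agree with the approximate values of 0.7 and 1.4 seen in the DEEP2 redshift distribution.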

The VVDS spectroscopic surveys are being carried out
using the VIMOS spectrograph on the Very Large Telescope
(VLT). There is a shallow survey out to $I_{AB}$ = 22.5 planned
in 5 fields and a deeper survey out to $I_{AB}$ = 24 in a single
field. Targets are pre-selected using magnitudes from the
imaging survey being carried out in the UBVRI bands using
the CFH12k camera on the CFHT.

The latest catalogue contains 8981 objects up to I_AB =
24 in the redshift range 0 < z < 5.228 (Le Fèvre et al. 2005,
2004) and has been used to construct the normalised VVDS-
Deep redshift distribution plotted in Figure 8. Note that for
both the DEEP2 and VVDS-Deep samples, we have removed
stars and other objects with very low redshifts as well as
high-redshift objects with z > 2 before plotting the redshift
REDSHIFT SURVEY    SELECTION CRITERIA                   NO. OF REDSHIFTS

SDSS Stripe 82     r < 20                               70000
2dFGRS             b_J < 19.45                          90000
VVDS Shallow       I_AB < 22.5                          ~26000
VVDS Deep          I_AB < 24                            ~60000
DEEP2              (B - R) < 0.4                        ~54000
                   (R - I) > 1.25
                   (B - R) < 2.35(R - I) - 0.54

Table 3. Summary of some of the spectroscopic surveys that will provide useful training sets for DES along with their imposed colour 
and magnitude cuts and the number of redshifts they are expected to obtain on completion. 



distributions so as to match the redshift range of the DES5yr
sample.

Having obtained the redshift distributions for both
DEEP2 and VVDS-Deep, we proceed to construct
training sets that simulate these surveys, to be used when
running our neural network code. This is done as follows. We
first separate our DES5yr sample of 1 million objects into
two equal-sized training and testing sets. We then divide the
training set into 20 redshift bins and, from the DEEP2 and
VVDS-Deep redshift distributions, calculate N_i, the number
of galaxies from these surveys that would be present in each
redshift bin i once the survey is complete and has obtained
spectra for the number of objects given in Table 3. We then
randomly choose N_i galaxies from the DES5yr training set
in redshift bin i, and in this way we construct a
new training set of galaxies that has the same redshift
distribution as the real spectroscopic surveys. The new training
sets are then split further in order to create validation
sets for ANNz to run on. This is done for a DES catalogue
with optical grizY photometry as well as a DES+VHS
catalogue with 8-band optical and NIR JHKs photometry. Our
simulated DEEP2 sample has many more galaxies at
intermediate redshifts, whereas the simulated VVDS-Deep
survey samples the low and high redshift regimes better than
DEEP2.
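
The resampling step described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used for the paper: the function name `match_redshift_distribution`, the uniform mock pool and the Gaussian "DEEP2-like" target shape are all assumptions made for the example.

```python
import numpy as np


def match_redshift_distribution(z_pool, target_fraction, bin_edges, n_total, rng):
    """Return indices into z_pool whose redshift histogram follows target_fraction.

    target_fraction[i] is the fraction of the spectroscopic survey expected in
    bin i; n_total is the number of redshifts the survey obtains on completion.
    """
    chosen = []
    for i in range(len(bin_edges) - 1):
        in_bin = np.flatnonzero((z_pool >= bin_edges[i]) & (z_pool < bin_edges[i + 1]))
        # N_i: galaxies the completed survey would place in bin i,
        # capped by what the mock pool can actually supply.
        n_i = min(int(round(target_fraction[i] * n_total)), in_bin.size)
        if n_i > 0:
            chosen.append(rng.choice(in_bin, size=n_i, replace=False))
    return np.concatenate(chosen)


rng = np.random.default_rng(42)
z_pool = rng.uniform(0.0, 2.0, size=500_000)   # stand-in for the mock training half
bin_edges = np.linspace(0.0, 2.0, 21)          # 20 redshift bins
centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])
target = np.exp(-0.5 * ((centres - 1.0) / 0.2) ** 2)  # toy DEEP2-like shape (assumed)
target /= target.sum()

train_idx = match_redshift_distribution(z_pool, target, bin_edges, 54_000, rng)
counts, _ = np.histogram(z_pool[train_idx], bins=bin_edges)
```

Sampling without replacement inside each bin keeps every mock galaxy unique in the new training set; the cap on N_i only matters if the mock pool has fewer galaxies in a bin than the survey would deliver.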

We then run our neural network code on the two DES
catalogues using three different training sets each time: a
training set with a DES redshift distribution, one with a
DEEP2 redshift distribution and one with a VVDS-Deep
redshift distribution. The results are shown in Figure 9,
where we plot the scatter on the photometric redshift as
a function of the spectroscopic redshift for all these cases.

At low redshifts, the scatter is large for all three training
sets due to the lack of u-band data. At intermediate redshifts
of 0.75 < z < 1.4, the DEEP2 sample does better than the
other training sets by ~40%, as all its galaxies are
concentrated in this region. As we move to even higher redshifts,
the DEEP2 sample gives very poor results due to a lack of
training set galaxies in this redshift range, whereas the DES
and VVDS-Deep samples perform better. For 1.4 < z < 2,
the DEEP2 scatter is worse by a factor of ~2 compared to
the VVDS-Deep and DES training sets. As expected, the
scatter is smaller overall when we include NIR photometry
for all three training set scenarios. The improvement is
particularly noteworthy in the high redshift regime. Here, the
scatter is reduced more for the VVDS-Deep and DES training
sets with the addition of the NIR and not as much for
the DEEP2 training set.

We can therefore clearly see that using a combination
of DEEP2 and VVDS-Deep data to calibrate our DES
photometric redshifts is already as good as having a complete
training set for DES.

Figure 8. The normalised redshift distributions for the DES, VVDS-
Deep and DEEP2 surveys as a function of spectroscopic redshift
between 0 < z < 2. As can be seen,
the DEEP2 colour cuts mean that most objects lie in the redshift
range 0.7 < z < 1.4. VVDS-Deep on the other hand effectively
samples the entire DES redshift range, although it has fewer
galaxies at intermediate redshifts.


6.4.2 Effect on the Photo-z Bias

The bias on the photometric redshift estimate, b_z, in a given
redshift bin between z_1 and z_2 is given by:

$b_z = \langle z_{\rm spec} - z_{\rm phot} \rangle$ (5)

This bias can arise from various sources. The perfor- 
mance of the neural network will introduce some difference 
between the photometric and spectroscopic redshifts. Fur- 
thermore, having an incomplete training set or a cosmic 
variance limited sample also leads to biases in the photo- 
metric redshift estimate. If this bias does not depend on the 
testing set however, we can quantify it exactly using our 
training set and it can be subtracted from the photometric 
redshift estimate no matter how large it is. Once this is done, 
the residuals give us some indication of the dependence of
the bias on the testing set. This error on the bias cannot be 
corrected for and it is this quantity that we evaluate in this 
section. 
Figure 9. The scatter on the photometric redshift as a function of the spectroscopic redshift when the DES, DEEP2 and VVDS-Deep
redshift distributions are used to construct the training set used by the neural network. The left-hand panel shows the scatter for a
catalogue with optical grizY photometry and the right-hand panel shows the scatter for a catalogue with 8-band optical and NIR JHKs
photometry. The scatter is large at low redshifts for all three training sets due to a lack of u-band photometry. At intermediate redshifts,
the DEEP2 sample performs best as all its galaxies are concentrated in this redshift range. Both the DES and VVDS-Deep training sets
produce considerably less scatter than the DEEP2 training set at high redshifts.



We quantify the errors in the bias that arise from using
different training and testing sets. In particular we look at
the effects of size and incompleteness of both the training
and testing sets. All the analysis carried out here is for the
DES+VHS dataset, and we model the incomplete training
sets by imposing the DEEP2 colour cuts detailed in Table 3
on galaxies from the DES5yr catalogue.

The standard deviation on the bias in each bin can be
defined as follows, assuming Poisson statistics:

$\mathrm{rms}(b_z) = \sigma_z / \sqrt{N_s}$ (6)

where σ_z is the 1σ scatter on the photometric redshift given
by Eq. 4 and N_s is the number of spectroscopic training set
galaxies. Given a suitably large number of training set
galaxies with spectroscopic redshifts, this error on the bias can
be effectively ignored as it does not depend on the testing
set. In order to better understand some of the other sources
of error on the bias that do depend on the testing set, we
study the samples detailed in Table 4.
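
As a rough illustration of these two quantities, the sketch below computes the bias and its Poisson error in redshift bins for a mock catalogue. It is not the code used in this work: the function name `bias_per_bin` and the toy catalogue (a constant offset of 0.01 with Gaussian scatter 0.1) are assumptions, and the scatter in each bin is taken to be the rms of z_spec - z_phot.

```python
import numpy as np


def bias_per_bin(z_spec, z_phot, bin_edges):
    """Mean of (z_spec - z_phot) and its Poisson error in each spectroscopic bin."""
    n_bins = len(bin_edges) - 1
    bias = np.full(n_bins, np.nan)
    rms_bias = np.full(n_bins, np.nan)
    which = np.digitize(z_spec, bin_edges) - 1   # bin index of every galaxy
    residual = z_spec - z_phot
    for i in range(n_bins):
        dz = residual[which == i]
        if dz.size > 1:
            bias[i] = dz.mean()
            sigma_z = np.sqrt(np.mean(dz ** 2))        # rms scatter in the bin (assumed definition)
            rms_bias[i] = sigma_z / np.sqrt(dz.size)   # sigma_z / sqrt(N_s)
    return bias, rms_bias


rng = np.random.default_rng(1)
z_spec = rng.uniform(0.0, 2.0, size=200_000)
z_phot = z_spec - 0.01 + rng.normal(0.0, 0.1, size=z_spec.size)  # toy photo-z (assumed)
bin_edges = np.linspace(0.0, 2.0, 51)   # bins of width 0.04
bias, rms_bias = bias_per_bin(z_spec, z_phot, bin_edges)
```

With roughly 4000 galaxies per bin and a scatter of 0.1, the Poisson error on the bias is already below 2 × 10^-3, which is why this term can be neglected for large training sets.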

We first quantify the difference in the bias when using
different numbers of training set galaxies to calculate
photometric redshifts for the same sample of testing set galaxies.
Note that the differences in the bias calculated here are used
as empirical estimators of the systematic shift one would get
when dealing with data. SET3 and SET4 are used as training
sets to calculate photometric redshifts for SETe. All three
samples are complete and have no imposed colour cuts. In



^ We have checked that this is a reasonable approximation to the 
error arising from the neural network and the analytical expres- 
sion for the error on the bias agrees with the error from the neural 
network to within 15%. 





        Number of Galaxies    Colour Cuts

SET1    50000                 None
SET2    50000                 Same as DEEP2
SET3    70000                 None
SET4    200000                None
SETa    250000                None
SETb    250000                None
SETc    102643                Same as DEEP2
SETd    102589                Same as DEEP2
SETe    500000                None


Table 4. Summary of training and testing sets used to quantify 
differences in estimates of the photometric redshift bias. 

Figure 10 we plot the biases obtained for each of the two
cases and the difference between these. This shows us that
changing the size of our training set by a factor of ~3 leads
to a difference in the biases of the order of 10^-3.

We proceed now to look at the effects of incomplete training
sets on the photometric redshift bias. To do this, we use
SET1 and SET2 as training sets to calculate the photometric
redshifts for samples SETa, SETb, SETc and SETd. SET1
is a complete training set while SET2 has been cut to reflect
the colour cuts of the DEEP2 survey. SETa and SETb are
both complete testing sets containing different galaxies
from the DES5yr mock catalogue, while SETc is generated
by imposing the DEEP2 colour cuts on SETb and SETd by
imposing the DEEP2 colour cuts on SETa. The biases on
the photo-z estimate obtained for each of the different
configurations of training and testing sets are shown in Figure
11. Note that throughout this analysis, we use bins of width
0.04 in redshift space.

From Figure 11 we can draw the following general
conclusions. Changing the galaxies that are present in the
testing set when training with a complete and fully
representative training set leads to a change in the bias of the order
of 10^-3. Using an incomplete training set on these samples
also leads to the same difference in the bias between them
(Figure 11, right panel, broken line). When using an
incomplete training set such as that provided by the DEEP2
survey (SET2) on a testing set of galaxies with no imposed
colour cuts (SETa and SETb), the bias in the photometric
redshift is worsened as expected. However, if we impose the
same colour cuts as DEEP2 on our testing set (SETc), the
bias is improved by ~20% for z < 0.7 and by ~40% for
z > 1.4, i.e. in the redshift ranges where the training set is
incomplete. The difference in the bias when using a testing
set with no colour cuts and one with imposed colour cuts to
match those of the incomplete training set is of the order
of 5 × 10^-3 and always smaller than 10^-2 in the redshift
range of the incomplete training set (inset, left-hand panel
of Figure 11).

Figure 10. The bias on the photometric redshift estimate, as a function of spectroscopic redshift, when using a training set of 70000 galaxies (SET3) and when using a training
set of 200000 different galaxies (SET4) on the same testing set (SETe). The right-hand panel plots the difference between the two biases. We
can see that increasing the number of training set galaxies by a factor of ~3 leads to a change in the bias of the order of 10^-3.

Figure 11. The left-hand panel shows the biases on the photometric redshift estimate when using each of the different configurations
of training and testing sets detailed in Table 4. In the right-hand panel, we plot the difference in the bias when using two different sets
of DES5yr galaxies with an incomplete training set (broken line), two different sets of DEEP2 galaxies with an incomplete training set
(green solid line) and the difference in the bias when using a complete testing set and a testing set with the same colour cuts as the
incomplete training set (black solid line). These plots are generated using bins of width 0.04 in redshift space.

In §7.2 we will briefly comment on how the errors on
the photometric redshift bias propagate into errors in the
calculation of the dark energy equation of state parameter,
w.



7 IMPLICATIONS FOR COSMOLOGY 

7.1 Optimal Estimation of the Galaxy Power 
Spectrum 

In this section, we look at the impact of photometric redshift 
estimation on dark energy science. In particular, we concen- 
trate on the measurement of dark energy using galaxy power 
spectra and baryon acoustic oscillations. 

The galaxy angular power spectrum is a measure of the
clustering in the galaxy population within time bins
extending from the present to a time when the Universe was only
a third of its present age. Large-scale surveys like DES
provide ideal data sets for studying the clustering properties
of galaxies, and therefore the clustering properties of their
underlying dark matter distribution, and hence are useful
probes for mapping how the dark matter distribution evolves
with time. Furthermore, many other characteristic features
appear in the power spectrum which provide standard rulers
that can be used to determine the angular diameter distance,
D_A, as a function of redshift. Baryon acoustic oscillations are
one such feature of interest, which appear as wiggles in the
power spectrum. The positions of the peaks and troughs of
these wiggles in Fourier space can be used to determine a set
of cosmological parameters, e.g. Blake & Glazebrook (2003)
and Seo & Eisenstein (2003).

The accuracy with which we can measure this typical
acoustic scale is proportional to the average fractional error
in the power spectrum, δP/P. The fractional error on the
power spectrum arises from two sources. Firstly, the
number of independent spatial modes that we can measure in a
given volume V is finite, and this leads to errors in the power
spectrum that are proportional to 1/√V. This is known as
cosmic variance. Secondly, there is a contribution from shot
noise due to imperfect sampling of the fluctuations, as we
only have a finite number of tracers of these fluctuations
within a given volume. If we assume a density field that
follows Gaussian statistics, we can follow Feldman et al. (1994)
and assume the error on the power spectrum measurement,
P, is weighted in the following way:



$\frac{\delta P}{P} \propto \frac{1}{\sqrt{V}} \left( 1 + \frac{1}{nP} \right)$ (7)



where n is the mean number density in a given volume as 
seen by an observer and can be written in terms of the galaxy 
redshift distribution as follows: 



$n(z) = \frac{dN}{dz} \left( f_{\rm sky} \frac{dV}{dz} \right)^{-1}$ (8)



where fsky is the fraction of the sky covered by the survey 
and dV/dz is the comoving volume element. 
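
To make the conversion from a redshift distribution to a comoving number density concrete, here is a minimal sketch. It assumes a flat ΛCDM cosmology with Ω_m = 0.3, distances in h^-1 Mpc, and that n(z) is the survey dN/dz divided by f_sky times the full-sky dV/dz; the function names and the example value of dN/dz are hypothetical.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # c / H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc


def e_of_z(z, omega_m=0.3):
    """Dimensionless Hubble rate E(z) = H(z)/H0 for flat LCDM."""
    return np.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def comoving_distance(z, omega_m=0.3, n_steps=4001):
    """Line-of-sight comoving distance in h^-1 Mpc (trapezoidal integration)."""
    grid = np.linspace(0.0, z, n_steps)
    integrand = 1.0 / e_of_z(grid, omega_m)
    return C_OVER_H0 * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))


def dv_dz_full_sky(z, omega_m=0.3):
    """Full-sky comoving volume element dV/dz in h^-3 Mpc^3 per unit redshift."""
    r = comoving_distance(z, omega_m)
    return 4.0 * np.pi * r ** 2 * C_OVER_H0 / e_of_z(z, omega_m)


def number_density(dn_dz, z, f_sky, omega_m=0.3):
    """Mean comoving number density in h^3 Mpc^-3 from the survey dN/dz."""
    return dn_dz / (f_sky * dv_dz_full_sky(z, omega_m))


f_sky = 0.119
# Volume of the survey cone out to z = 2, in h^-3 Gpc^3.
survey_volume_gpc3 = f_sky * 4.0 / 3.0 * np.pi * comoving_distance(2.0) ** 3 / 1e9
# Hypothetical dN/dz of 1.5e8 galaxies per unit redshift at z = 1.
n_at_z1 = number_density(1.5e8, 1.0, f_sky)
```

With f_sky = 0.119 the volume out to z = 2 comes out close to the 23.7 h^-3 Gpc^3 adopted for DES later in this section, which is a useful sanity check on the assumed cosmology.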

The first term in Eq. 7 denotes the effect of cosmic
variance while the second term is the contribution from shot
noise. In order to minimise the error on the power spectrum,
one has to design a survey with maximum volume, provided
there are enough sources within this volume for the shot
noise contribution to be minimal. If nP > 3 the power
spectrum is well estimated and there is no significant advantage
to be gained with more galaxies (Seo & Eisenstein 2003).
In this work, we assume that to obtain a reasonable
estimate of the power spectrum we need to satisfy the condition
nP > 1. Taking into account the galaxy bias, b, that scales
the galaxy power spectrum to the matter power spectrum,
and including the scaling of the matter power spectrum with
redshift as a linear growth factor, D(z), we get the following
expression for nP_gal:



$n(z) P_{\rm gal}(k_*) = n(z)\, b^2(z)\, D^2(z)\, P(k_*)$ (9)



We have used the formalism for the transfer function
set out in Eisenstein & Hu (1998) to calculate our power
spectrum at k_* = 0.1 h Mpc^-1, as this is well within the linear
regime of the power spectrum. At larger values of k,
non-linearities due to clustering and other structure formation
start to dominate and make it harder to detect the BAO
signal.

We assume a survey with 0 < z < 2 and f_sky = 0.119.
The bias is assumed to be 1.2. In Figure 12 we plot nP_gal
as a function of redshift. This is done for the entire
catalogue and for clipped catalogues with different clipping
thresholds. We perform the same analysis for optical-only
DES data as well as optical and NIR data from DES+VHS.
The results are summarised in Table 5 and Figure 12.

From these results we can see that applying a threshold
error at which to cut our photometric redshift catalogue
proves effective in removing outliers from our sample before
performing any kind of cosmological analysis on it. For the
DES catalogue of redshifts obtained using grizY photometry,
we can remove all galaxies with an error estimate of more
than ~0.03 in order to obtain an accurate measurement of
the galaxy power spectrum out to a redshift of 1. This leaves
us with only 10% of our original sample, but this sample has
a scatter on its photometric redshift that is a factor of ~2.7
smaller than that of the original sample and is therefore
more effective in constraining the cosmology. For the
DES+VHS catalogue, a threshold error of 0.025 can be applied
to effectively constrain the galaxy power spectrum to
redshift 1. This leaves us with only 3% of our original sample,
but the overall scatter on the photometric redshift has been
reduced by a factor of ~2.75. This is equivalent to performing
an LRG selection on our survey, as these galaxies have
more accurate large-scale structure signals and more
accurate photometric redshifts due to the prominence of their
4000 Å break (Blake et al. 2007; Padmanabhan et al. 2005).

In order to provide a reasonable measurement of the
galaxy power spectrum for the entire DES redshift range of
0 < z < 2, we can apply a threshold error cut of > 0.1 to
the DES-only sample and use most of the galaxies in our
analysis. When we add NIR photometry from VHS to our
sample, a less conservative clipping cut of 0.05 can be applied
and only 37% of the galaxies used to reduce the scatter on
the photometric redshift by a factor of ~2. Note that σ_68 is



^ While we are aware of the dependence of our results on this
bias, it is difficult at this point to make an educated guess of what
b will be for DES galaxies. We have therefore used a reasonable
scale-independent bias in our calculations.
DES grizY photometry

Threshold Error   Redshift Range for nP > 1   σ       σ_68    Fraction of Galaxies Remaining

None              0 < z < 2                   0.128   0.08    1
0.100             0 < z < 1.7                 0.084   0.065   0.79
0.050             0 < z < 1.3                 0.058   0.052   0.38
0.040             0 < z < 1.1                 0.055   0.047   0.24
0.030             0 < z < 1.05                0.048   0.044   0.095
0.025             0 < z < 0.95                0.047   0.043   0.035
0.020             0 < z < 0.8                 0.047   0.045   0.007

DES grizY + VHS JHKs photometry

Threshold Error   Redshift Range for nP > 1   σ       σ_68    Fraction of Galaxies Remaining

None              0 < z < 2                   0.11    0.074   1
0.100             0 < z < 1.9                 0.074   0.062   0.80
0.050             0 < z < 1.4                 0.054   0.048   0.37
0.040             0 < z < 1.3                 0.049   0.043   0.22
0.030             0 < z < 1.1                 0.043   0.039   0.09
0.025             0 < z < 1.0                 0.041   0.037   0.03
0.020             0 < z < 0.65                0.041   0.037   0.005


Table 5. Summary of the redshift ranges over which we can obtain optimal measurements of the power spectrum for different clipping
threshold errors and the corresponding values of σ, σ_68 and the fraction of galaxies remaining in our sample for each of these cases. The
top table is for DES grizY photometry and the bottom table for DES+VHS JHKs photometry.
Figure 12. nP as a function of the redshift for different levels of clipping. The left-hand plot is generated using DES grizY photometry
only whereas the right-hand plot is produced from a catalogue with 8-band DES+VHS photometry. For the same clipping threshold
error, nP for DES+VHS is greater than 1 over a larger redshift range compared to the DES-only case. This means that we
can obtain an optimal measurement of the power spectrum out to higher redshifts if we use the full 8-band DES+VHS photometry.



also reduced in these cases although not to the same extent
as the reduction in σ. A reduction in σ_68 corresponds to a
reduction in the intrinsic scatter of our sample minus the
outliers.

By clipping our catalogues in this way before performing
any kind of cosmological analysis on them, we have
effectively managed to reduce the errors in our measurement
of the galaxy power spectrum without compromising on
the precision with which this measurement has been made.
Adding NIR data from VHS to our DES photometry has also
allowed us to clip our catalogues more effectively and
therefore make more precise measurements of the galaxy power
spectrum out to higher redshifts.

We now follow Blake et al. (2006) and Blake & Bridle (2005)
and take into account the photometric redshift
errors explicitly in our galaxy power spectrum analysis. These
authors have shown that the fractional error on the galaxy
power spectrum is related to the photometric redshift error
as follows:
$\frac{\delta P}{P} \propto \sqrt{\sigma_r}$ (10)

where σ_r is the rms error in comoving coordinates in units
of h^-1 Mpc. We can relate this to the redshift error already
introduced in Eq. 4 as follows:



$\sigma_r = \frac{c\,\sigma_z}{H(z)} = \frac{c\,\sigma_z}{H_0 \sqrt{\Omega_m (1+z)^3 + \Omega_k (1+z)^2 + \Omega_\Lambda}}$ (11)
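
As a quick numerical illustration of this conversion, the sketch below evaluates σ_r = c σ_z / H(z) in h^-1 Mpc. The function name `comoving_scatter` is hypothetical, and the default parameters (Ω_m = 0.3, Ω_Λ = 0.7, H_0 = 100 h km/s/Mpc) are assumptions chosen to match the standard cosmology used elsewhere in this paper.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_scatter(sigma_z, z, omega_m=0.3, omega_lambda=0.7, h0=100.0):
    """Radial comoving position error in h^-1 Mpc for a redshift error sigma_z."""
    omega_k = 1.0 - omega_m - omega_lambda   # zero for a flat universe
    hubble = h0 * math.sqrt(
        omega_m * (1.0 + z) ** 3 + omega_k * (1.0 + z) ** 2 + omega_lambda
    )
    return C_KM_S * sigma_z / hubble


sigma_r_example = comoving_scatter(0.1, 1.0)  # a photo-z scatter of 0.1 at z = 1
```

For these assumed parameters a scatter of 0.1 at z = 1 corresponds to a radial smearing of roughly 170 h^-1 Mpc, comparable to the BAO scale itself.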

The fractional error in the power spectrum is now given
by:

$\frac{\delta P}{P} \propto \sqrt{\frac{\sigma_r}{V}} \left( 1 + \frac{1}{nP} \right)$ (12)



We can calculate this quantity for different values of the
clipping threshold. This is done for an optical DES sample
as well as an optical and NIR DES+VHS sample in redshift
bins of width 0.02. We assume a DES comoving survey
volume, V, of 23.7 h^-3 Gpc^3 between 0 < z < 2. Our choice of
redshift is motivated by Figure 12, where it can be seen that
there are enough galaxies in the survey out to a redshift
of 2 for shot-noise errors not to be significant, provided we
do not clip our sample. As we are calculating the fractional
error in the power spectrum in redshift bins, we define the
effective volume in a redshift bin as follows:



$\delta V_{\rm eff} = \left[ \frac{n(z)P}{n(z)P + 1} \right]^2 \delta V$ (13)



The fractional error on the power spectrum in a redshift 
bin can then be re-written as: 



$\frac{\delta P}{P} \propto \sqrt{\frac{\sigma_r}{\delta V_{\rm eff}}}$ (14)



As we can see, decreasing the threshold error at which
to clip the sample reduces the scatter on the photometric
redshift and therefore σ_r, leading to smaller fractional errors
on the power spectrum. However, this also reduces nP and
therefore δV_eff, thereby increasing the shot-noise
contribution to the error in the power spectrum due to a lack of
sufficient galaxies in the sample. Clearly, there is an optimum
threshold error that needs to be determined, and this is what we
proceed to do.
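
The competition between the two effects can be illustrated with a toy calculation. The sketch below assumes that the fractional error scales as sqrt(σ_r / V) (1 + 1/nP), that σ_r is proportional to the photo-z scatter σ at fixed redshift, and that clipping lowers nP uniformly by the fraction of galaxies remaining. The σ and fraction values are the DES grizY entries of Table 5; the unclipped value of nP is a free, hypothetical input, and the uniform scaling of nP is a simplification since the real fraction removed varies with redshift.

```python
import math

# (threshold, sigma, fraction remaining) for DES grizY, taken from Table 5.
DES_CLIPPING = [
    (None, 0.128, 1.0),
    (0.100, 0.084, 0.79),
    (0.050, 0.058, 0.38),
    (0.040, 0.055, 0.24),
    (0.030, 0.048, 0.095),
    (0.025, 0.047, 0.035),
    (0.020, 0.047, 0.007),
]


def relative_error(sigma_clip, sigma_full, frac_remaining, np_full):
    """(dP/P after clipping) / (dP/P with no clipping) under the stated assumptions."""
    scatter_term = math.sqrt(sigma_clip / sigma_full)   # gain from smaller sigma_r
    shot_clip = 1.0 + 1.0 / (frac_remaining * np_full)  # shot noise after clipping
    shot_full = 1.0 + 1.0 / np_full                     # shot noise before clipping
    return scatter_term * shot_clip / shot_full


def best_threshold(np_full, table=DES_CLIPPING):
    """Threshold in the table that minimises the relative fractional error."""
    sigma_full = table[0][1]
    scored = [
        (relative_error(sigma, sigma_full, frac, np_full), threshold)
        for threshold, sigma, frac in table
    ]
    return min(scored, key=lambda pair: pair[0])[1]


dense_case = best_threshold(np_full=10.0)   # plenty of galaxies before clipping
sparse_case = best_threshold(np_full=1.0)   # already close to shot-noise limited
```

In this toy model a densely sampled bin tolerates a tighter cut than a sparsely sampled one, which mirrors the behaviour described above: the optimum threshold moves to larger values as nP falls.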

The results of our study are shown in Figure 13 and
Table 6, where we give the fractional error in the power
spectrum for different levels of clipping divided by the fractional
error in the power spectrum obtained using the entire
sample. We only do this for clipping thresholds greater than or
equal to 0.03 as below this, shot noise is dominant across
the entire redshift range. If the plotted quantity is less than
one for a given clipping threshold, cutting the sample using
this threshold improves our constraints on the galaxy power
spectrum.

From these results we can clearly see that there exists 
a trade-off between the shot-noise contribution to the error 
on the power spectrum and the contribution from cosmic 
variance. At high values of the threshold error, most of the 
galaxies in the sample are used for analysis and shot-noise is 
not a problem. However, the scatter on the photometric red- 
shift is large leading to larger errors in the power spectrum 
measurement. At very low values of the threshold error, the 
photo-z scatter is reduced but there are too few galaxies in 



the sample and shot noise begins to dominate. There is an 
optimum value of the threshold error at which the fractional 
errors in the power spectrum are at a minimum. This value 
is different for different redshift ranges as well as for the two 
different catalogues. 

When we add the VHS NIR data to the DES optical
catalogue, we can apply a smaller clipping threshold out to the
same redshift range compared to the DES-only case in order
to minimise the error in the power spectrum. This means
we remove more outliers from the DES+VHS catalogue and
thereby reduce our photometric redshift errors without
compromising on the precision with which we can do cosmology.
Also, we can see that using the DES-only catalogue, we are
unable to clip in the highest redshift bin of 1.4 < z < 2 as
this increases the shot-noise errors in our power spectrum.
However, if we add the VHS NIR photometry, we can
remove ~20% of our galaxies in this bin and produce a power
spectrum that is 15-20% more accurate than that obtained
using all the galaxies.

We can conclude that in the absence of large
spectroscopic surveys like the proposed WFMOS survey
(Bassett et al. 2005), photometric surveys could prove
competitive in constraining dark energy through galaxy power
spectrum measurements if the outliers were effectively
removed.

It is worth noting though that when applying this clip- 
ping procedure to a real survey, one would choose the opti- 
mal clipping threshold based on the training sets available 
and not from simulations. 



7.2 Effect of Photometric Redshift Bias on Dark 
Energy Equation of State 

In this section we look at the effect of the photometric
redshift bias on the dark energy equation of state parameter,
w. We have already seen in §6.4.2 that systematic errors in
the photometric redshift bias can arise when we use different
numbers of galaxies in our training and testing sets and also
when one or both of these samples is in some way
incomplete. We can translate the errors in the bias given by the
right-hand panels of Figure 10 and Figure 11 into an error on
the value of w calculated using baryon acoustic
oscillations as a probe. The position of the BAO peaks can be used
to find the angular diameter distance, D_A, which in turn tells
us about the expansion history of the Universe and hence
w. If there is a systematic uncertainty on the photometric
redshift bias, Δb, this can be related to the uncertainty in w
using the angular diameter distance in the following way:



$\Delta w = \Delta b \, \frac{\partial D_A}{\partial z} \frac{\partial w}{\partial D_A}$ (15)



By assuming that Δb is given by the difference curves
plotted in the right-hand panels of Figure 10 and Figure 11,
we can find Δw for each of the cases investigated in §6.4.2.
This is shown in Figure 14. Note that throughout this
calculation we keep all other cosmological parameters constant
and use a standard cosmology with Ω_m = 0.3, Ω_Λ = 0.7 and
h = 0.7.
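
A minimal numerical sketch of this propagation is given below. It assumes a flat universe with constant w, Ω_m = 0.3, distances in h^-1 Mpc, and that Δw is the bias uncertainty multiplied by (∂D_A/∂z)(∂w/∂D_A), with both derivatives evaluated by central finite differences. The function names and step size are illustrative choices, not taken from this work.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # c / H0 in h^-1 Mpc


def angular_diameter_distance(z, w=-1.0, omega_m=0.3, n_steps=4001):
    """D_A in h^-1 Mpc for a flat universe with constant equation of state w."""
    grid = np.linspace(0.0, z, n_steps)
    e_squared = (omega_m * (1.0 + grid) ** 3
                 + (1.0 - omega_m) * (1.0 + grid) ** (3.0 * (1.0 + w)))
    integrand = 1.0 / np.sqrt(e_squared)
    d_comoving = C_OVER_H0 * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))
    return d_comoving / (1.0 + z)


def bias_to_w_factor(z, w=-1.0, omega_m=0.3, step=1e-3):
    """Product (dD_A/dz)(dw/dD_A), using dw/dD_A = 1 / (dD_A/dw)."""
    dda_dz = (angular_diameter_distance(z + step, w, omega_m)
              - angular_diameter_distance(z - step, w, omega_m)) / (2.0 * step)
    dda_dw = (angular_diameter_distance(z, w + step, omega_m)
              - angular_diameter_distance(z, w - step, omega_m)) / (2.0 * step)
    return dda_dz / dda_dw


def delta_w(delta_b, z, w=-1.0, omega_m=0.3):
    """Error on w at redshift z implied by an uncertainty delta_b on the photo-z bias."""
    return delta_b * bias_to_w_factor(z, w, omega_m)


factor_z1 = bias_to_w_factor(1.0)
dw_z1 = delta_w(1e-3, 1.0)
```

The sign of the factor is negative at z = 1 in this toy cosmology because increasing w decreases D_A at fixed redshift; only the magnitude matters when quoting an uncertainty on w.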

For interest, we also plot the product of the two
derivatives in Eq. 15 for four different cosmologies in Figure 15,
as this is what links the error in the photometric redshift
Figure 13. δP/P for different levels of clipping divided by δP/P for no clipping as a function of the redshift. The left-hand plot
is generated using DES grizY photometry only whereas the right-hand plot is produced from a catalogue with 8-band DES+VHS
photometry. At small threshold errors, the power spectrum measurement is shot-noise dominated whereas at large threshold errors, large
photo-z errors result in large positional uncertainties.



DES grizY photometry

Redshift Range    Optimum Clipping Threshold   Fraction of galaxies used   Improvement in δP/P

0 < z < 0.1       0.025                        4%                          75%
0.1 < z < 0.9     0.03                         10%                         20-30%
0.9 < z < 1.0     0.04                         24%                         35%
1.0 < z < 1.2     0.05                         38%                         10-15%
1.2 < z < 1.4     0.1                          79%                         4%
1.4 < z < 2       None                         100%                        None

DES grizY + VHS JHKs photometry

Redshift Range    Optimum Clipping Threshold   Fraction of galaxies used   Improvement in δP/P

0 < z < 1         0.03                         8%                          30%
1 < z < 1.3       0.05                         37%                         10-15%
1.3 < z < 2       0.1                          80%                         15-20%


Table 6. Summary of the optimum threshold error to be applied in different redshift ranges in order to minimise the fractional error on
the power spectrum. The top table is for DES grizY photometry and the bottom table for DES+VHS JHKs photometry.



to the error on w. We assume a flat universe and change
Ω_m in order to get four different cosmologies with different
amounts of matter and dark energy.

We can see that using different size training sets or
different testing sets leads to an error in w at a given redshift
that is of the order of 0.01 for z > 0.5. The error in w is of
the order of 0.08 for 0.5 < z < 1 and z > 1.5 when we use an
incomplete testing set with an incomplete training set as
opposed to a complete testing set with an incomplete training
set. The error is smaller, around 0.01, in the redshift range
1 < z < 1.5. The error in all cases is very large at z < 0.5.
As using different training and testing sets is equivalent to
having a cosmic variance limited sample, we conclude that

^ This does not correspond to the error one would get from fitting 
a constant w but rather the error of measuring w at that given 
redshift. 



systematics on the photo-z bias due to cosmic variance do
not hinder the calibration of w to a percent level. However,
the black solid line of Figure 14 shows us that using an
incomplete training set to calibrate photometric redshifts for
DES would lead to large uncertainties in w unless the
sample being tested was cut to match the training set. These
results are relatively insensitive to the clipping procedure
introduced in §6.3 and the calculation has therefore been carried
out in all cases for the unclipped catalogues of galaxies.

Although this is a rough calculation, it gives us a feel for
the uncertainties in calculating the dark energy equation of
state that can arise due to differences in the photometric
redshift bias. In order to get the exact uncertainty on w due
to uncertainties in the photometric redshift, one would have
to conduct a full Fisher matrix analysis applied to baryon
acoustic oscillations, such as that done by Ma et al. (2006)
for weak lensing. This involves translating the error on the
Figure 14. The error in w as a function of redshift due to the
photometric redshift bias.

Figure 15. (∂D_A/∂z)(∂w/∂D_A) for different cosmologies with Ω_m =
0.1, Ω_m = 0.3, Ω_m = 0.5 and Ω_m = 0.9. All models assume a flat
universe.



bias and scatter in the photometric redshift to an error on 
w by marginalising over all other cosmological parameters. 

From this work, we can conclude that although we can 
use neural networks reasonably successfully to obtain photo- 
metric redshifts by extrapolating from an incomplete train- 
ing set, if this is done with a survey such as DES it will 
create systematic errors on w of the order of ~10%. The 
possibility of using a template-fitting method to calculate 
photometric redshifts in regions where the training set is 
incomplete should be further investigated. 



In this work we have shown the role of near infra-red pho- 
tometry from the VISTA Hemisphere Survey in constraining 
photometric redshifts for the Dark Energy Survey. We have 
examined the effects of galaxy reddening and training sets 
on the photo-z estimate. We have also studied the biases in 
the photometric redshift estimate when using training and 
testing sets with different sizes and levels of completeness 
and quantified the error in the dark energy equation of state 
parameter, w that arises from differences in these biases. 

A method of clipping our galaxy catalogues by remov- 
ing outliers based on the ANNz error estimate on the photo- 
metric redshift has been presented. By applying a threshold 
error to our catalogues and rejecting all objects with errors 
bigger than this threshold error, we can effectively reduce 
the overall scatter on the photometric redshifts of our sam- 
ple. 

Finally, we have conducted a full galaxy power spec- 
trum analysis using our DES and DES-I-VHS catalogues and 
looked at how our clipping method can improve the uncer- 
tainties on our galaxy power spectrum measurements. We 
find that there is an optimum threshold error at which we 
should clip our catalogues and this error depends on the 
catalogue being used and the redshift range in which we are 
evaluating the power spectrum. If we use a high value for the 
threshold error, the scatter on our photometric redshift es- 
timate is high, leading to large positional uncertainties and 
therefore large errors in the power spectrum due to cos- 
mic variance. However, if we adopt a very low value for our 
threshold error, we remove most of the galaxies from our 
sample before calculating the power spectrum and the re- 
sulting uncertainties in the power spectrum are dominated 
by shot noise. We find that the optimum threshold error 
is smaller for the DES+VHS catalogues than for the 
DES-only catalogues in the same redshift range and hence 
more outliers are removed from this sample before analy- 
sis. Adding the VHS NIR data thus helps us to compute 
the galaxy power spectrum more accurately out to higher 
redshifts than for the DES only case. 
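
The trade-off described above can be sketched with a toy model. This is not the power spectrum pipeline used in this work. It assumes the standard mode-counting estimate of the fractional error, sqrt(2/N_modes) (1 + 1/(nP)), and a crude stand-in for radial damping in which only modes with |k_parallel| < 1/sigma_r remain usable. All densities, scatters and mode counts are invented for illustration.

```python
import math


def fractional_error(n_gal, power, n_modes, k, sigma_r):
    """Toy fractional error on P(k) in a single k-bin.

    n_gal   : galaxy number density after clipping, in (h/Mpc)^3
    power   : power spectrum amplitude at k, in (Mpc/h)^3
    n_modes : independent Fourier modes in the bin for perfect redshifts
    k       : wavenumber of the bin, in h/Mpc
    sigma_r : radial position uncertainty from the photo-z scatter, in Mpc/h
    """
    # Fraction of an isotropic shell of modes with |k_parallel| < 1/sigma_r.
    usable = min(1.0, 1.0 / (k * sigma_r)) if sigma_r > 0 else 1.0
    # Shot noise inflates the error when n_gal * power is small.
    shot_noise_factor = 1.0 + 1.0 / (n_gal * power)
    return math.sqrt(2.0 / (n_modes * usable)) * shot_noise_factor


# Hypothetical clipping choices -> (surviving density, radial scatter).
scenarios = {
    "no clipping": (1.0e-3, 300.0),
    "moderate clip": (8.0e-4, 150.0),
    "harsh clip": (2.0e-5, 100.0),
}
errors = {
    name: fractional_error(n, power=1.0e4, n_modes=5000, k=0.1, sigma_r=s)
    for name, (n, s) in scenarios.items()
}
best = min(errors, key=errors.get)
for name, err in errors.items():
    print(f"{name:14s} fractional error = {err:.3f}")
print("lowest error:", best)
```

With these made-up inputs the intermediate threshold gives the smallest error. The unclipped case loses usable radial modes and the harshly clipped case is dominated by shot noise.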

In summary, our main conclusions are: 

• NIR data from VISTA VHS helps to reduce the scatter 
on DES photometric redshifts by ~30% for z > 1. 

• Reddening the galaxies can increase the photo-z scatter 
of DES by ~30% in some redshift ranges due to the degen- 
eracy between redshift and reddening that exists in these 
redshift ranges. However, this is unlikely to be a major issue 
as most of our mock DES galaxies do not suffer from heavy 
extinction. 

• ANNz can be used to predict the extinction, A_V, of DES 
galaxies to an accuracy of 0.27 and to classify them into six 
spectral types - E, Sbc, Scd, Im, SB2 and SB3. 

• The VVDS-Deep and DEEP2 spectroscopic surveys, 
when finished, will provide a very complete training set for 
DES out to a redshift of 2. 

• Using different numbers of training set galaxies can lead 
to a difference in the photometric redshift bias of the order 
of 10"^ 

• If we have an incomplete training set, we can improve 
the photometric redshift estimate by imposing the same 
colour cuts on the testing set as are applied to the training 
set. When this is done, the improvement in the photometric 
redshift bias is of the order of 10% compared with using 
a complete testing set with no imposed colour cuts. 

• The clipping method introduced by Abdalla et al. (2007) 
can be effectively applied to the DES+VHS sample. 
Applying a threshold error of 0.1 at which to cut our 
sample reduces the scatter on the photometric redshift by 
~50% by removing ~20% of the galaxies. 

• A clipping threshold of 0.1 is optimal for calculating the 
DES power spectrum out to a redshift of 1.4. Applying this 
clipping threshold reduces δP/P by ~20%. When the VHS 
NIR data is added to the DES sample, the optimal clipping 
threshold in the same redshift range is 0.05 and this reduces 
the fractional error in the power spectrum by ~20-30%. In 
order to calculate the power spectrum out to a redshift of 
2, the addition of VHS NIR data is crucial. In this redshift 
range, applying the optimal clipping threshold of 0.1 results 
in an improvement in δP/P by ~15-20%. 

• Systematic errors on the photometric redshift bias aris- 
ing from cosmic variance lead to uncertainties in the dark 
energy equation of state parameter, w, of about a percent. 

• However, if we use an incomplete training set to deter- 
mine photometric redshifts on a testing set that has not been 
cut to match the training set, the resulting uncertainties in 
the photometric redshift bias can lead to errors in w of the 
order of 10% if we keep all other cosmological parame- 
ters constant. Note, however, that this result is model spe- 
cific and depends to a certain extent on our choice of mock 
catalogues and the algorithm used to calculate photometric 
redshifts, in this case neural networks. 
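
The bias and scatter figures quoted in the list above can be tracked per redshift bin with a short script. The following is a minimal sketch, assuming the bias is the mean of z_phot - z_spec and the scatter is its root mean square. These are common conventions that this section does not restate. The mock estimator, with a +0.02 offset above z = 1, is hypothetical.

```python
import math
import random
from collections import defaultdict


def binned_bias_and_scatter(z_phot, z_spec, bin_width=0.5):
    """Return {bin_start: (bias, rms_scatter, count)} in bins of spectroscopic z."""
    bins = defaultdict(list)
    for p, s in zip(z_phot, z_spec):
        bins[math.floor(s / bin_width) * bin_width].append(p - s)
    stats = {}
    for start, dz in sorted(bins.items()):
        bias = sum(dz) / len(dz)
        scatter = math.sqrt(sum(d * d for d in dz) / len(dz))
        stats[start] = (bias, scatter, len(dz))
    return stats


rng = random.Random(0)
z_spec = [rng.uniform(0.0, 2.0) for _ in range(40000)]
# Hypothetical estimator: unbiased below z = 1, offset by +0.02 above it.
z_phot = [s + rng.gauss(0.02 if s >= 1.0 else 0.0, 0.1) for s in z_spec]

stats = binned_bias_and_scatter(z_phot, z_spec)
for start, (bias, scatter, count) in stats.items():
    print(f"z in [{start:.1f}, {start + 0.5:.1f}): "
          f"bias = {bias:+.4f}, scatter = {scatter:.3f}, N = {count}")
```

Running the same function on catalogues built from different training sets gives the difference in bias per bin, which is the quantity that propagates into the error on w.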

In the absence of large spectroscopic surveys, the DES 
and VHS datasets, when combined, will prove extremely ef- 
fective in constraining dark energy through large scale struc- 
ture signals like baryon acoustic oscillations. By clipping 
photometric redshift catalogues and carefully removing a 
suitable number of outliers, one can achieve reasonably pre- 
cise measurements of the galaxy power spectrum out to a 
redshift of 2. 



ACKNOWLEDGEMENTS 

We are very grateful to Peter Capak for providing the 
JPLCAT simulations. We thank members of the DES pho- 
tometric redshift and large scale structure working groups 
for useful discussions, in particular Josh Frieman, En- 
rique Gaztañaga and Will Percival. We also thank Richard 
McMahon and Will Sutherland for information regarding 
the VISTA public surveys. MB is supported by an STFC stu- 
dentship. FBA acknowledges support from the Leverhulme 
Foundation through an Early Careers Fellowship. 



REFERENCES 

Abdalla F. B., Amara A., Capak P., Cypriano E. S., Lahav 
O., Rhodes J., 2007, ArXiv e-prints, 705 
Arnaboldi et al. 2007, The Messenger, 127, 28 
Babbedge et al. 2004, MNRAS, 353, 654 
Bassett B. A., Nichol B., Eisenstein D. J., 2005, Astronomy 
and Geophysics, 46, 26 
Benítez N., 2000, ApJ, 536, 571 



Bishop C. M., 1995, Neural Networks for Pattern Recogni- 
tion (New York: Oxford Univ. Press) 
Blake C., Bridle S., 2005, MNRAS, 363, 1329 
Blake C., Collister A., Bridle S., Lahav O., 2007, MNRAS, 
374, 1527 
Blake C., Glazebrook K., 2003, ApJ, 594, 665 
Blake et al. 2006, MNRAS, 365, 255 

Bolzonella M., Miralles J. M., Pello R., 2000, A&A, 363, 
476 

Bruzual A. G., Charlot S., 1993, ApJ, 405, 538 
Budavári et al. 1999, in Weymann R., Storrie-Lombardi L., 
Sawicki M., Brunner R., eds. Photometric Redshifts and 
the Detection of High Redshift Galaxies Vol. 191 of Astro- 
nomical Society of the Pacific Conference Series, Creating 
Spectral Templates from Multicolor Redshift Surveys, pp 
19-+ 

Calzetti D., 1997, AJ, 113, 162 
Capak et al. 2004, AJ, 127, 180 

Coleman G. D., Wu C. C., Weedman D. W., 1980, ApJS, 
43, 393 

Collister A. A., Lahav O., 2004, PASP, 116, 345 
Connolly et al. 1995, AJ, 110, 2655 

Cowie L. L., Barger A. J., Hu E. M., Capak P., Songaila 
A., 2004, AJ, 127, 3137 
Csabai et al. 2003, AJ, 125, 580 

Davis et al. 2003, in Guhathakurta P., ed., Discoveries 
and Research Prospects from 6- to 10-Meter-Class Tele- 
scopes II. Vol. 4834 of Proceedings of the SPIE, Science 
Objectives and Early Results of the DEEP2 Redshift Survey, 
pp 161-172 
Eisenstein D. J., Hu W., 1998, ApJ, 496, 605 
Feldman H. A., Kaiser N., Peacock J. A., 1994, ApJ, 426, 
23 

Feldmann et al. 2006, MNRAS, 372, 565 

Fioc M., Rocca-Volmerange B., 1997, A&A, 326, 950 

Firth A. E., Lahav O., Somerville R. S., 2003, MNRAS, 
339, 1195 
Hu W., 1999, ApJ Lett., 522, L21 
Kinney et al. 1996, ApJ, 467, 38 
Le Fèvre et al. 2004, A&A, 428, 1043 
Le Fèvre et al. 2005, A&A, 439, 845 

Lin H., Cunha C., Lima M., Oyaizu H., Frieman J., Collis- 
ter A., Lahav O., Dark Energy Survey 2004, in Bulletin 
of the American Astronomical Society Vol. 36, Photometric 
Redshift Simulations for the Dark Energy Survey, pp 1462-+ 

Lin H., Yee H. K. C., Carlberg R. G., Morris S. L., Sawicki 
M., Patton D. R., Wirth G., Shepherd C. W., 1999, ApJ, 
518, 533 

Ma Z., Hu W., Huterer D., 2006, ApJ, 636, 21 
Oyaizu H., Cunha C., Lima M., Lin H., Frieman J., 2006, 
in American Astronomical Society Meeting Abstracts 
Vol. 208, Photometric Redshifts for the Dark Energy Sur- 
vey, pp 60.03-+ 
Padmanabhan et al. 2005, MNRAS, 359, 237 
Poli et al. 2003, ApJ Lett., 593, L1 
Riess et al. 1998, AJ, 116, 1009 
Rowan-Robinson M., 2003, MNRAS, 345, 819 
Seo H.-J., Eisenstein D. J., 2003, ApJ, 598, 720 






Spergel et al. 2007, ApJS, 170, 377 
Tegmark et al. 2004, ApJ, 606, 702 

The Dark Energy Survey Collaboration 2005, ArXiv As- 
trophysics e-prints 
Wirth et al. 2004, AJ, 127, 3121 



